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We compute absolute binding affinities for two ligands bound to the FKBP protein using non- 
equilibrium unbinding simulations. The methodology is straight-forward, requiring little or no 
modification to many modern molecular simulation paclcages. The approach makes use of a physical 
pathway, eliminating the need for complicated alchemical decoupling schemes. Results of this study 
are promising. For the ligands studied here the binding affinities are typically estimated within 
less than 4.0 kj/mol of the target values; and the target values are within less than 1.0 kj/mol 
of experiment. These results suggest that non-equilibrium simulation could provide a simple and 
robust means to estimate protein- ligand binding affinities. 



I. INTRODUCTION 

The accurate estimation of binding affinities for 
protein-ligand systems (AG) remains one of the most 
challenging tasks in computational biophysics and 
biochemistry^. Due to the high computational cost of 
such free energy computation, it is of interest to un- 
derstand the advantages and limitations of various AG 
methods. 

Many previous studiecl^'^'^'^'^'^™^"'"'"'"'^^'^^'^^'"'^*^'^ 

have calculated protein-ligand binding affinities using 
equilibrium free energy methods such as thermody- 
namic integrationPS, free energy perturbatiorPsED^ g^j^^j 
weighted histogram analysi^^. Due to the introduction 
of the novel Jarzynski approacbP^ it is also possible 
to estimate AG from non-equilibrium simulations. 
However, the estimation of AG for protein-ligand bind- 
ing using non-equilibrium approaches remains largely 
untested. Two recent studies used n on-eq uilibrium 
simulations to test unbinding pathways^^Eni. Stud- 
ies by other groups found that estimating AG via 
non-equilibrium simulati ons r esulted in a large error 
compared to experimeniPi^. A recent study by 
Kuyucak and collaborators demonstrated that use of 
non-equilibrium simulation required longer simulation 
times than umbrella sampling^^l. 

In this report, apparently for the first time, we demon- 
strate the ability to compute accurate (as compared to 
experimental data) protein-ligand binding AG estimates 
following a non-equilibrium methodology. The approach 
relies on performing multiple non-equilibrium unbind- 
ing simulations using a physical pathway, i.e., pulling 
the ligand out of the binding pocket, and then uses 
the Jarzynski relatiorP^ to estimate AG. The system is 
an FKBP protein complexed with 4-hydroxy-2-butanone 
(BUQ) and dimethyl sulfoxide (DMSO). The motivation 
for using this system is that comparison to experiment 
is possibl^^ and m any previous computational studies 
have been performecP^E^n^MIlI], 

The importance of pursuing non-equilibrium methods 
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such as used in this report is three-fold: (i) The approach 
is trivially parallelizeable since each non-equilibrium un- 
binding simulation is performed independently, (ii) The 
method is simple to implement in many existing simu- 
lation packages such as GROMAC^^, used here; little 
or no modification to the code is necessary, (iii) Since 
a physical pathway is utilized, there is no need to use 
alchemical decoupling schemes. 

This study represents the first stage of a project com- 
2Q:im2fl|3bhe efficiencies of various free energy methods 
for protein-ligand AG computation. We note that effi- 
ciency studies have bee n carried out for other non-protein 



II. THEORY 

In general, the absolute binding affinity is defined as 
the free energy difference between the unbound (apo) and 
bound (holo) states of the protein-ligand system. We de- 
fine the apo state as when the protein and ligand are 
not interacting due to a large separation between them. 
The holo state is defined to be when the ligand is in the 
binding pocket of the protein. Experimentally, the bind- 
ing affinity is measured by determining the equilibrium 
constant Kcq ~ [PL]/[P][L], where [PL] denotes the con- 
centration of the protein-ligand complex, and [P] and [L] 
are the concentrations of the apo protein and free ligand, 
respectively. Then the absolute binding affinity is given 
by AGbind = —ksT ln{Keq/Vo) , where fc^ is the Boltz- 
mann constant, T is the system temperature, and Vq is 
the standard volume used for the experiment (typically 
Vq = 1.661 nm^ corresponding to 1.0 mol/liter concen- 
tration). 

Following the notation of Roux and coUaboratorJi^ESl 
(also sec discussion in Refs. .1.4.10) the equilibrium con- 
stant is given by a ratio of integrals over the apo and holo 
regions of configurational space 



Kcq = 



(1) 



where (3 = l/fc^T, x represents the configurational coor- 
dinates of the ligand, X are the coordinates of the protein 
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FIG. 1: The coordinate system used for the restraints 
Ua,{9,(f)) and Ui{r,t). The value of r is given by the center 
of mass separation between the protein and hgand. A1-A4 
are the heavy atoms used to define the coordinate system 
for ?7a. For the FKBP-DMSO system Al = DMS0:S1, A2 
= TRP59:N, A3 = HIS25:N and A4 = ALA64:N. For the 
FKBP-BUQ system Al = BUQ:C2, and A2-A4 are the same 
as FKBP-DMSO. 



where strategies for computing the terms Ii and I2 will 
be discussed below. 

The first term Ii = e^^^'^'^ in Eq. ^ corresponds 
to the free energy difference associated with restraining 
the protein to the binding axis while in the holo state. 
This free energy can be computed using any standard 
technique by performing simulations for a range of force 
constants from to ka- For the current study we chose 
to compute this free energy difference via a multi-stage 
Bennett approadP^. 

To determine the second term I2 in Eq. ^ we define 
the potential of mean force (PMF) W, with the restraint 
potential C/a present, as a function of the scalar distance 
r 



and solvent, and U{x,X) is the potential energy of the 
system. The vector r defines the location of the center 
of mass of the ligand relative to the center of mass of the 
protein (see Fig. [T|, and ?% is a reference value taken to 
be when the ligand and protein are not interacting. 

Equation ([ij can be used to compute binding affini- 
ties using various computational strategies. In our study, 
we restrain the ligand relative to the protein so that 
the ligand remains along the binding axis. The poten- 
tial energy of this axial restraint is given by J7a(0, — 
\ka \{Q — Oq)^ + {4>~ 4'o)^] , where ka is the force constant 
and 9o, (j)o are reference values of the coordinates; see Fig. 
[1] With this restraint defined, Eq. ([T]) can be written as 
a product of dimensionless ratios of integrals 



-/3AGb 



K, 



eg 



h = 



Vo J dxS{f-n)JdXe-Pu' 



(2) 



'/3[W(r2)-W(ri)] 



/df (5(r-r2)/rfXe-^(^+^-) 
J dx S{r - n) J dX e-l^(u+u.) 



(3) 

Integrating the PMF over both apo and holo regions we 
can obtain 



dfi S{fi — 7%) e ' - ; a/-2 

apo ./holo 



1 



dr2 e-''['^(''2)-'^(''i)l 
^dxS{f-f,)JdX e-PU ' 



(4) 



where we have used the fact that (5(r— r*) = 47rr^ 5{f—r^) 
for the apo integral since the PMF is independent of the 
direction of f when the ligand is not interacting with 
the protein. Thus I2 can be evaluated by estimating the 
integral of the PMF in Eq. Q , 



l2 = ^ I dn 5{n - n) e'/^^^ / dr2 e-'5[^('^^)-^('-i)l 



^0 J apo holo 

47rr 



Idr, cos 0,d0,d<t>i ^-PU.iSM g+;3W(.,) f ^-PWir.) 



apo ^" ' 1 J holo 



"le+P^^^-^ j cos0id^id(Aie"''^»(^-^^' / drse-'^^f'-^). (5) 

'''^0 J apo J holo 
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Below the apo integral will be evaluated analytically and the holo integral will be evaluated using quadrature. 
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With our approximations above the absolute binding 
free energy can now be estimated via the relation 



AG, 



bind 



AG 



holo 



cos 



-/3(7a(e,0) 



apo 



holo 



(6) 



This is our central theoretical result. Thus, estimating 
AGbind in the current framework requires three compu- 
tations: the PMF must be calculated (detailed below), 
AG^°'° must be approximated, and the apo integral must 
be analytically evaluated. 

One may note that a different, yet viable, non- 
equilibrium approach would be to set fca = 0, i.e., remove 
the axial restraint. This would simplify the AGbind cal- 
culation since AGjj°^° = and the apo integral would be 
47r in Eq. (|6|, and thus only the PMF would be needed. 
Further, the binding axis would not need to be defined by 
the researcher, which would allow the ligand more flexible 
unbinding routes. However, there are two important ad- 
vantages to using the axial restraint. Most important is 
that since the axial restraint limits the allowable config- 
urational space for the ligand, the PMF converges much 
more quickly than with no restraint. Also, in cases where 
the binding pocket is not at the protein surface, or when 
more than one viable pathway exists, it may be advan- 
tageous, or even necessary, to define the binding path to 
obtain meaningful results. 



A. Estimating the PMF 

The PMF will be computed using two different ap- 
proaches: the Hummer-Szabo method, and the stiff- 
spring approximation with the second cumulant expan- 
sion. Below we summarize these techniques. 

In the Hummer-Szabo approach the PMF is estimated 
by performing multiple non-equilibrium pulling simula- 
tions along the reaction coordinate r by using a time- 
dependent biasing potential Ur{r,t) — kr[r — (rg -I- vt)]'^, 
where kr is the force constant, r — r(t) is the protein- 
ligand center of mass separation, tq is an initial reference 
separation which is constant for all pulling simulations 
(i.e., ro ^ ''(0)), and v is the speed at which the biasing 
center is moved. The PMF is then estimated viS^'^ 



-/3W(r) _ 



(S[r - rit)] eM-PWt)) 



{exp{-f3Wt)) 
exp(-/3£/,(r,t)) 
(expi-pWt)) 



X e 



-21n(r) 



(7) 



where the sum is over time slices t, and the 21n(r) term 
is the Jacobian correction which is necessary since r is 
a radial distancd^. The (. . .) indicates an ensemble av- 
erage for pulling simulations drawn from the Boltzmann 
distribution corresponding to the initial system potential 



energy ?7tot(^, X,t = 0) = U{x, X) + U,{e, <^) + U,{r, 0). 
The work for a given time slice is given by "^'^ 



dr 



k.rV 



[r{t') - ro]dt' 



UMO),0)- (8) 



Note that Wt is the accumulated work minus the initial 
t ~ biasing energy. 

The stiff-spring approx imation utilizes the well-known 
Jarzynski equalitjl^MsiiZI ^ estimate the PMF. The ap- 
proximation is that for a sufficiently large force constant 
kr that the protein-ligand separation closely follows the 
biasing center, i.e., ^ = 4- ~ r. Park and Schul- 
ten thus concluded that the accumulated work along the 
reaction coordinate r is approximat ely eq ual to the ac- 
cumulated work for a given time 



-/3W(r) 



,-/3AG(r) _ /„-0W, 



X e 



-21n(r) 



(9) 



where the 21n(r) term is necessary due to the Jacobian 
correctioiJ^ and the work is determined by integrating 
the biasing force over the location of the bias center 



J r. 



ro+vt 



dUr{r,t) dr 



ro 9r 



ro 



dr 



de-t/..(r(0),0) 
dC-f/.(r(0),0), (10) 



where we have used the fact that dr/d^ « 1. Apply- 
ing the cumulant expansion to Eq. ([9]), we obtain the fi- 
nal expr ession used estimate the PMF for the stiff-spring 
approacbP^'^ 

WW « {Wr) - ^m^) ~ (W.)'] +21n(r) . (11) 



For the results given in this report the ligand is 
pulled out of the binding pocket, and the reverse pro- 
cess of pulling the ligand into the pocket is not con- 
sidered. Future studies will include the reverse pro- 
cess since the use of bi-directional simulation has been 

shown to be an effective a pproach to accurate AG 
estimatioiP21i i | 42 | 50 | 5i | 52 | 53 | 54 ] 

We note two aspects of the relationships embodied in 
Eqs. ^ and ([TT]): (i) The equahty in Eq. Q holds only in 
the case of obtaining all possible pulling trajectories. The 
approximation in Eq. ( 1 1 1 is an equality for the case that 
the work value distribution is perfectly Gaussian. Thus, 



it is important to calculate uncertainty estimates for the 
PMF, and if possible, to compare results to an indepen- 
dent computational measure — below we will compare our 
results to use of umbrella sampling, (ii) The relation is 
independent of the speed at which the system is forced, 
i.e., the unbinding speed. In practice, however, it has 
been found that the speed chosen can dra maticall y affect 
the convergence behavior of the estimateJ^SHmU, 
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B. Use of a physical pathway 

It is useful to consider the advantages and disadvan- 
tages of using a physical (rather than alchemical) path- 
way. The regions of configurational space correspond- 
ing to apo and holo in Eq. ([T]) are well-separated with 
no overlap, thus a pathway connecting them is typically 
created. For our discussion below, this pathway will be 
parameterized using the variable A. 

In the case of a physical pathway, such as in the current 
study, X = r represents the protein-ligand separation. 
By contrast, for an alchemical pathway A is generally 
a parameter that scales the strength of the interactions 
between the ligand and rest of the system. 

Our use of a physical pathway is motivated by several 
factors. Alchemical pathways are typically much more 
difficult to implement than physical pathways since inter- 
actions must be scaled carefully. In addition, restraints 
must often be employed such that the non-interacting 
parts do not drift away from the region of interest. 

We note that there are disadvantages to using physical 
pathways. Physical pathways may require the researcher 
to determine the pulling direction such that the ligand 
exits the binding pocket, i.e., determined by choice of C/a 
in this report. Alchemical pathways do not require such 
a choice. Perhaps most important, physical pathways 
require larger system sizes when explicit solvent is used, 
as in the current report. The size of the system must be 
large enough that the hgand can be pulled to a distance 
such that interactions between the ligand and protein are 
negligible. 

In cases where the binding site is buried deep within 
the protein, alchemical methods should be much more 
efficient than physical approaches. However, when the 
binding pocket is close to the protein surface, as for the 
current study, it is not clear where alchemical or physical 
approaches are more efficient and/or accurate. 

Another important consideration is that the use of a 
physical pathway allows the researcher to determine the 
PMF along the pathway. This PMF can give insights 
into binding that are simply not possible when using al- 
chemical methods, e.g., determining the preferr ed bin d- 
ing pathway when multiple pathways are presentP2i22l. 

C. Use of a non-equihbrium approach 

Non-equilibrium approaches, such as used in the cur- 
rent study, rely on computing the work required to force 
the system from one state to the other rapidly enough 
that equilibrium is not attained at any value of A. This 
process is repeated many times and the resulting distri- 
bution of work values is used to estimate AG^*. By con- 
trast, equilibrium free energy methodologies such as ther- 
modynamic integratioip3!, free energy perturbatioij ^^ * ^^ !, 
and weighted histogram analysis^^, share the common 
strategy of generating equilibrium ensembles of config- 
urations for multiple values of the scaling parameter A. 



It is important when performing such AG computation 
that enough simulation time is spent to equilibrate at 
each value of A such that the resulting ensemble is valid 
for the current A. 

It is not currently known whether equilibrium or non- 
equilibrium methodologies offer an efficiency advantage 
for typical protein-ligand binding affinity computa- 
tion. Equilibrium methods have been widely used 
to gen erate accurate AG estimates for protein-Ug and 

]3^^j^j^j J2l3l4l5l6l7l8l9ll0lllll2ll3ll4ll5ll6ll7ll8ll9l20l21l22l23l 

However, if equilibrium is not attained the resulting 
AG estimate ca n be heavily biased. With few very 
recent exceptiond^^IMIIIlMl] non-equilibrium methods 
are largely untested on protein-ligand systems. In 
previous calculations of relative solvation free energies 
non-equilibrium methods were proven to be equal or 
superior in efficiency to commonly used equilibrium 
methodfpf. 

A key advantage of non-equilibrium methodologies is 
the ease that one can parallelize the AG calculation. 
Since each work value must necessarily be generated in- 
dependently, the corresponding simulations can be run 
in parallel with no loss of accuracy to the final AG es- 
timate. Equilibrium AG computations, by contrast, are 
not trivially parallelizeable. One can imagine perform- 
ing each A simulation in parallel, however one must be 
very careful about the configurations used to start each 
A simulation. In typical cases it is necessary to start the 
current A simulation using the final snapshot from the 
previous A simulation; thus, the A simulations are per- 
formed in a serial fashion. If this is not done, the amount 
of time needed to equilibrate at each value of A could be 
heavily dependent on the chosen starting structures. The 
AG estimate could be heavily biased if the time spent for 
equilibration at each A value is inadequate. 



III. METHODS 

A. Computational details 

The initial coordinates for the FKBP- ligand complexes 
were obtained from the Protein Data BanlP^ 1D7H for 
FKBP-DMSO, and 1D7J for FKBP-BUQ. The topolo- 
gies for DMSO and BUQ were then generated by the 
PRODRG servei^^, with partial charges modified by the 
author. 

The GROMACS simulation package version 3.3.^ 
was used with the default GROMOS-96 43A1 
forcefield^^. The software was slightly modified to 
provide the biasing potential which depends only on 
the center of mass separation between the ligand and 
the protein. Protonation states for the histidine residues 
were selected by the GROMACS program pdb2gmx: 
HIS25 was protonated at and HIS87 and HIS94 

were protonated at Ne2. The protein-ligand complexes 
were then solvated in a cubic box of SPG watei'^ of 
approximate initial size 6.8 nm a side. A single chloride 
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ion was randomly placed in each water box to give a 
net neutral charge, and then each system was minimized 
using steepest decent for 500 steps. To allow for equi- 
libration of the water, each system was then simulated 
for 1.0 ns with the positions of all heavy atoms in the 
ligand and protein harmonically restrained with a force 
constant of 1000 kJ/mol/nm^. The temperature was 
maintained at 300 K using Langevin dynamicd^ with 
a friction coefficient of 1.0 amu/ps. The pressure was 
maintained at 1.0 atm using the Berendsen algorithnJ^. 
Wc note that the Berendsen algorithm does not produce 
canonically distributed structures, however, none of the 
resulting simulation frames were used for generating AG 
estimates, as will be seen below. The LINGS algorithnP 
was used to constrain hydrogens to their ideal lengths 
and heavy hydrogens were used — the hydrogen mass 
was increased by a factor of four and this increase was 
subtracted from the bonded heavy atom so that the 
mass of the system remained unchanged — allowing the 
use of a 4.0 fs timestep. Particle mesh EwalcP^ was used 
for electrostatics with a real-space cutoff of 1.0 nm and 
a Fourier spacing of 0.1 nm. Van der Waals interactions 
used a cutoff with a smoothing function such that the 
interactions smoothly decayed to zero between 0.75 nm 
and 0.9 nm. Dispersion corrections for the energy and 
pressure were utilizecPI. 

After the position restrained simulation, a 4.0 ns equi- 
librium simulation at constant temperature and volume, 
with restraints, was used to generate starting configu- 
rations for use in the PMF calculations. Each FKBP- 
protein complex was simulated with parameters chosen 
identical to the position restrained simulation above ex- 
cept that the volume was fixed at the value of the fi- 
nal configuration from the position restrained simula- 
tions. Importantly, for Eq. ([t]) and (11 1 to be used these 
equilibrium simulations must include the restraints, i.e., 
both Ua and were present. For both the DMSO and 
BUQ systems the axial restraint used a force constant 
ka = 1000 kJ/mol/nm^, and Oo,4>o were chosen to be 
equal to the values from the final snapshot of the posi- 
tion restrained simulation. For both systems the biasing 
potential Ui- used a force constant kr = 3000 kJ/mol/nm^ 
and To = 0.5 nm and a speed v = 0. 

Starting structures for the unbinding simulations were 
chosen to be equally spaced within the 4.0 ns equilib- 
rium simulation. So, if 40 starting structures were de- 
sired, then the spacing between snapshots was 100 ps. 
The pulling simulations were performed using identical 
parameters to the 4.0 ns equilibrium simulation, except 
that the bias center was moved at a constant speed v 
ranging from 1.0 x 10^"* nm/ps to 8.0 x 10^^ nm/ps. 
The pulling simulations were discontinued when the bias 
center was at a position of 2.5 nm. 

The non-equilibrium unbinding simulations provided 
us with the protein ligand separation r at every time 
step, which we used to compute the work and thus the 
resulting PMF. An example of this is shown in Fig. [2] 
where we have used Eq. dsl to compute the work as a 
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FIG. 2: Results shown here are for a single non-equilibrium 
pulling simulation performed on the FKBP-DMSO system us- 
ing a pulling speed of 1.0 x 10~* nm/ps. The solid line shows 
the protein-ligand center of mass separation as a function of 
simulation time. The dashed line shows the accumulated work 
Wt as computed by Eq. ([8| as a function of simulation time. 



function of simulation time Wf. 



B. Computing AGj""'" 

We used the Bennett acceptance ratio approach 
to compute the free energy differences AG^^°^° as- 
sociated with the axial restraints'^. With the lig- 
and bound to the protein we performed 1.0 ns equi- 
librium simulations for each of the values ka — 
0, 25, 40, 60, 90, 150, 200, 300, 450, 700, 1000. The first 0.5 
ns of each simulation were discarded for equilibration, 
and the remaining 0.5 ns were used to compute AG|j°'°. 
We did not attempt to optimize efficiency of the AGq°'° 
computations, our only concern was accurate values, so it 
may be possible to reduce the total computational time 
from that described above. 



C. Uncertainty estimation 

We estimated the uncertainty in our AGbind estimates 
using the bootstrap approach applied to the PMF: (i) 
The reference value of the PMF given by >V(r*) was 
computed via Eq. (Ill using N work values chosen at 



random (with replacement) from a dataset containing N 
values; (ii) The above step was repeated until the mean 
and standard deviation of the free energy estimates con- 
verged; around 100,000 trials in our study, (iii) The un- 
certainty is given by the converged standard deviation of 
the free energy estimates. 

For comparison, we also used the uncertainty analy- 
sis obtained by Zuckerman and WoolP^, and the Busta- 
mante group^^. These uncertainty estimates are reported 
to be accurate when the variance in the estimate domi- 
nates over the bias (as in the case of large N) . 
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D. Generating a target PMF 

Since the purpose of the current study was to test 
the effectiveness of non-equihbrium strategies it is im- 
portant to have an independent estimate of the PMF. 
Thus, we computed the PMF using umbrella sampling 
and weighted histogram analysis (WHAM|^. Simula- 
tions were performed with the restraints and Uj^. For 
the umbrella sampling simulations the speed was set 
to V = 0, all other parameters were identical to the 
non-equilibrium simulations, and 41 windows were used 
ro = 0.50,0.55,0.60, ...,2.45,2.50. Each window was sim- 
ulated for a total time of 12 ns; 6 ns were discarded for 
equilibration and 6 ns were used for the WHAM anal- 
ysis. Thus, the total simulation time was nearly three 
times greater than the non-equilibrium simulations de- 
tailed above. No attempt was made to test the efficiency 
since the goal was to generate the most accurate PMF. 
Note that the 21n(r) Jacobian correction from Eqs. Q 
and (111 were also used for the target PMF. 



IV. RESULTS AND DISCUSSION 

The results of this study are very encouraging. Us- 
ing the non-equilibrium methodology outlined above we 
estimated the the binding affinity for the FKBP-DMSO 
and FKBP-BUQ complexes typically within less than 4.0 
kJ/mol of the target values; and the target values are 
within less than 1.0 kJ/mol of experiment. 

Figure |3]shows the PMF as a function of protein- ligand 
separation for all systems studied here. Data is shown for 
both DMSO and BUQ systems, with pulling speeds indi- 
cated on each plot. Note that the same amount of total 
simulation time was spent on each non-equilibrium PMF 
estimate, but that the target PMF utilized three times as 
much simulation as the non-equilibrium estimates, thus 
we are not attempting to compare non-equilibrium and 
equilibrium approaches in this study. For both systems 
the non-equilibrium estimates tend to overestimate the 
target PMF, and underestimate the broadness of the 
PMF minimum. This suggests that the pulling speeds 
were not slow enough to properly sample the PMF. Also, 
use of stiff-spring approximation with the second cumu- 
lant expansion does tend to improve the non-equilibrium 
PMF curves. 

Table [l| shows the binding affinity results obtained 
via Eq. Ml, with the PMF computed using both the 
Hummer-Szabo approach of Eq. ([7| and the stiff-spring 
second cumulant approximation of Eq. (111. The com- 
putational estimates of AGbind for the target PMF are 
in excellent agreement with experimental data. The non- 
equilibrium estimates are typically within less than 4.0 
kJ/mol of the target values. Reference distances were 
chosen as = 2.4 nm for both DMSO and BUQ, and 
the value of the restraint free energy was found to be 
AGjj°'° = -9.0 kJ/mol for FKBP-DMSO and AG|;°'° = 
-7.8 kJ/mol for FKBP-BUQ. Finally the value of the apo 
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FIG. 3: (color online) The PMF as a function of the protein- 
ligand center of mass separation. Tliese PMF curves were 
numerically integrated for Eq. ([6| and used to generated the 
AGbind estimates shown in Table |l] All non-equilibrium re- 
sults utilized the same total amount of simulation time. For 
all plots the light colored solid curve shows the target PMF 
generated via equilibrium umbrella sampling and WHAM. 
The target PMF curves utilized three times more simula- 
tion time as the non-equilibrium simulations, and thus we are 
not attempting to compare the accuracy of equilibrium and 
non-equilibrium approaches, (a) FKBP-DMSO system using 
the Hummer-Szabo approach of Eq. Q. (b) FKBP-DMSO 
system using the stiff-spring second cumulant expansion ap- 
proximation of Eq. (111. (c) FKBP-BUQ system using the 
Hummer-Szabo approach of Eq. ([7|. (d) FKBP-BUQ system 
using the st iff-s pring second cumulant expansion approxima- 
tion of Eq. fTTl). 



integral in Eq. ^ was computed analytically to be 10.6 
kJ/mol for FKBP-DMSO and 10.7 kJ/mol for FKBP- 
BUQ. The results show that non-equilibrium estimates 
for the FKBP-DMSO system are more accurate than the 
FKBP-BUQ estimates suggesting that the FKBP-BUQ 
system requires more simulation time to converge. Un- 
certainty estimates were obtained using both a bootstrap 
method and the approach described in Refs. 1551551 

Table H] includes the standard deviation of the work 
values aw measured at the reference distance — 2.4 
nm. Previous studies have suggested that the optimal ef- 
ficiency for use of the Jarzynski relation is when the speed 
is slow enough that aw ~ 1.0 keT w 2.5 kJ/mol^^ 4^ 
Apparently the speeds attempted for the current study 
were not slow enough to generate work values with such 
a small aw- We note however, for the current study, the 
uncertainty does not appear to correlate with aw- Fu- 
ture studies will be carried out to determine if there is 
an optimal pulling speed for these systems. 

The results from Tab. |T] suggest that use of the 
Hummer-Szabo approach, while exact in the limit of infi- 
nite sampling, is not feasible for the current study. This 
is likely due to the fact that the pulling speeds used were 
too fast to generated work values with aw ~ 1-0 ksT. 
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Ligand A*' Speed (nm/ps) 


aw 


AGbind" 


AGbind'' 


Uncty'^ Uncty'* 


Targef^ Exp 


DMSO 10 


1.0 X lO"* 


6.9 


-12.6 


-11.1 


3.5 


2.4 


-9.2 -9.7 


20 


2.0 X 10~* 


7.9 


-15.6 


-10.7 


1.2 


1.0 




40 


4.0 X 10"* 


9.4 


-16.3 


-8.2 


1.6 


1.5 




80 


8.0 X 10""* 


10.4 


-15.9 


-9.2 


2.2 


1.7 




BUQ 10 


1.0 X lO""" 


7.8 


-30.0 


-26.7 


3.1 


2.2 


-18.3 -18.9 


20 


2.0 X 10""* 


11.9 


-23.2 


-16.0 


4.7 


2.4 




40 


4.0 X 10""* 


8.1 


-28.4 


-28.2 


3.8 


2.4 




80 


8.0 X 10-* 


11.3 


-36.0 


-22.3 


1.7 


1.5 





TABLE I: Comparison between non-equilibrium binding affinity estimates and target and experimental binding affinities. 
All energy values are shown in units of kj/mol. The first column describes the ligand used. The second column contains 
the number of work values A'^ used in the estimate, and the third and fourth columns are respectively, the corresponding 
speed of the restraint attached to the ligand, and the standard deviation of the work values. The rightmost column gives the 
experimental results reported in Ref. 134] 
Binding affinity estimate obtained via the Hummer-Szabo approach using Eqs. Q and m. 

Binding affinity estimate obtained using the stiff-spring second cumulant expansion approximation using Eqs. ^ and ( [Tl| . 

Uncertainty estimate computed via the bootstrap method. 

Uncertainty estimate computed from the approach described in Refs. I55I65I 



However, use of the approximate stiff-spring second cu- 
mulant expansion, while approximate, does tend to im- 
prove the AGbind estimates. This is consistent with the 
recent study by Minh and McCammon which determined 
that when similar speeds are utihzcd the stiff-spring sec- 
ond cumulant expansion method performed better than 
the other tested methods®^. 

We realize that the use of larger more flexible ligands 
may lead to difficulties in using the method suggested 
here. This is due to the large number of possible con- 
formations the ligand may adopt in the apo state; all of 
which must be sampled adequately to obtain an accurate 
PMF. However, the method may be modified by includ- 
ing an additional restraint to the RMSD of the ligand, 
thus restricting the conformational freedom of the ligand. 
The free energy of release from this RMSD restra int mus t 
then be included in the binding affinity estimat^^^'^^'^ll. 



A. Note on simulation time 

Each non-equilibrium estimate in Table |l] was gener- 
ated using a total simulation time of 216.0 ns (1.0 ns posi- 
tion restrained -I- 4.0 ns equilibrium to generate starting 
configurations + 200.0 ns for unbinding simulations + 
11.0 ns for A 6*^°'° estimation). Note however, that the 
unbinding simulations were performed in parallel. So, 
for example, at a speed of 2.0 x 10"^ nm/ps, twenty in- 
dependent 10.0 ns simulations were performed in paral- 
lel. Therefore, all the simulation data needed to compute 
AGbind can can be obtained in around a day of wall-clock 
time with the use of a computer cluster. 



V. CONCLUSIONS 

We have demonstrated that non-equilibrium unbinding 
simulations utilizing a physical pathway can be used to 
generate estimates of the binding affinity for the FKBP- 
DMSO and FKBP-BUQ systems studied here. The non- 
equilibrium estimates are typically within less than 4.0 
kJ/mol of the target values; and the target values are 
within less than 1.0 kJ/mol of experiment. 

Our results suggest that when the standard deviation 
of the work values is larger than the optimal aw ~ 
1.0 ksT that the stiff-spring second cumulant expansion 
approximation provides a better AGbind estimate than 
the exact Hummer-Szabo method. 

The importance of pursuing methods such as described 
here is that such non-equilibrium approaches are trivially 
parallelizeable since each unbinding simulation is per- 
formed independently. Also, due to the use of a physical 
pathway, the method is simple to implement in many ex- 
isting simulation packages with little or no modification 
to the software. 

We note that the method described here is not ex- 
pected to produce accurate binding affinities when the 
ligand is large and flexible. In this case, it is necessary 
to extend the approach to include additional restraints 
to the ligand during the unbinding simulation to prevent 
large-scale fluctuations. The contribution to the bind- 
ing affinity from t hese add itional restraints must then be 
taken into accountP- ^ l ^^ l ^^ l 

The results obtained here suggest that non-equilibrium 
unbinding simulations can be used to generate accurate 
estimates of binding affinities. Efficiency analysis and 
comparison to other methodologies will be carried out in 
future work. 
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